162 research outputs found

Partner Selection for the Emergence of Cooperation in Multi-Agent Systems Using Reinforcement Learning

Authors: Anastassacos Nicolas; Hailes Stephen; Musolesi Mirco
Publication date: 28/11/2019

Social dilemmas have been widely studied to explain how humans are able to cooperate in society. Considerable effort has been invested in designing artificial agents for social dilemmas that incorporate explicit agent motivations that are chosen to favor coordinated or cooperative responses. The prevalence of this general approach points towards the importance of achieving an understanding of both an agent's internal design and external environment dynamics that facilitate cooperative behavior. In this paper, we investigate how partner selection can promote cooperative behavior between agents who are trained to maximize a purely selfish objective function. Our experiments reveal that agents trained with this dynamic learn a strategy that retaliates against defectors while promoting cooperation with other agents, resulting in a prosocial society.

arXiv.org e-Print Archive

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Association for the Advancement of Artificial Intelligence: AAAI Publications

Identifying vulnerabilities of industrial control systems using evolutionary multiobjective optimisation

Authors: Hailes Stephen; Tuptuk Nilufer
Publication venue: Elsevier BV
Publication date: 01/02/2024

In this paper, we propose a novel methodology to assist in identifying vulnerabilities in real-world complex heterogeneous industrial control systems (ICS) using two Evolutionary Multiobjective Optimisation (EMO) algorithms, NSGA-II and SPEA2. Our approach is evaluated on a well-known benchmark chemical plant simulator, the Tennessee Eastman (TE) process model. We identified vulnerabilities in individual components of the TE model and then made use of these vulnerabilities to generate combinatorial attacks. The generated attacks were aimed at compromising the safety of the system and inflicting economic loss. Results were compared against random attacks, and the performance of the EMO algorithms was evaluated using hypervolume, spread, and inverted generational distance (IGD) metrics. A defence against these attacks in the form of a novel intrusion detection system was developed, using machine learning algorithms. The designed approach was further tested against the developed detection methods. The obtained results demonstrate that the developed EMO approach is a promising tool in the identification of the vulnerable components of ICS, and weaknesses of any existing detection systems in place to protect the system. The proposed approach can serve as a proactive defense tool for control and security engineers to identify and prioritise vulnerabilities in the system. The approach can be employed to design resilient control strategies and test the effectiveness of security mechanisms, both in the design stage and during the operational phase of the system.
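The hypervolume metric named in this abstract can be illustrated with a minimal sketch for two objectives. It assumes both objectives are minimised and that the caller supplies a reference point; the function names and sample points are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, assuming both objectives are minimised."""
    front: List[Point] = []
    best_f2 = float("inf")
    # Sorting by the first objective means a point is non-dominated
    # exactly when it strictly improves the best second objective seen so far.
    for p in sorted(set(points)):
        if p[1] < best_f2:
            front.append(p)
            best_f2 = p[1]
    return front


def hypervolume_2d(points: List[Point], reference: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    front = [
        p for p in pareto_front(points)
        if p[0] < reference[0] and p[1] < reference[1]
    ]
    area = 0.0
    prev_f2 = reference[1]
    # The front is ordered by increasing f1 and decreasing f2,
    # so each point contributes one horizontal strip.
    for f1, f2 in front:
        area += (reference[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area
```

A larger hypervolume indicates a front that is closer to the ideal point and more widely spread, which is why it is commonly used to compare algorithms such as NSGA-II and SPEA2.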

UCL Discovery

Identifying Vulnerabilities of Industrial Control Systems using Evolutionary Multiobjective Optimisation

Authors: Hailes Stephen; Tuptuk Nilufer
Publication date: 26/05/2020

In this paper we propose a novel methodology to assist in identifying vulnerabilities in real-world complex heterogeneous industrial control systems (ICS) using two evolutionary multiobjective optimisation (EMO) algorithms, NSGA-II and SPEA2. Our approach is evaluated on a well-known benchmark chemical plant simulator, the Tennessee Eastman (TE) process model. We identified vulnerabilities in individual components of the TE model and then made use of these to generate combinatorial attacks to damage the safety of the system and to cause economic loss. Results were compared against random attacks, and the performance of the EMO algorithms was evaluated using hypervolume, spread and inverted generational distance (IGD) metrics. A defence against these attacks in the form of a novel intrusion detection system was developed, using a number of machine learning algorithms. The designed approach was further tested against the developed detection methods. Results demonstrate that EMO algorithms are a promising tool in the identification of the most vulnerable components of ICS, and weaknesses of any existing detection systems in place to protect the system. The proposed approach can be used by control and security engineers to design security-aware control, and test the effectiveness of security mechanisms, both during design and later during system operation. Comment: 25 pages

arXiv.org e-Print Archive

UCL Discovery

Detection of explosive markers using zeolite modified gas sensors

Authors: Binions Russell; Hailes Stephen M.V.; Parkin Ivan P.; Peveler William J.
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 18/12/2012

Detection of hidden explosive devices is a key priority for security and defence personnel around the globe. Electronic noses, based on metal oxide semiconductors (MOS), are a promising technology for creating inexpensive, portable and sensitive devices for such a purpose. An array of seven MOS gas sensors was fabricated by screen printing, based on WO3 and In2O3 inks. The sensors were tested against six gases, including four explosive markers: nitromethane, DMNB (2,3-dimethyl-2,3-dinitrobutane), 2-ethylhexanol and ammonia. The gases were successfully detected with good sensitivity and selectivity from the array. Sensitivity was improved by overlaying or admixing the oxides with two zeolites, H-ZSM-5 and TS-1, and each showed improved responses to –NO2 and –OH moieties, respectively. Admixtures in particular showed promise, with excellent sensitivity and good stability to humidity. Machine learning techniques were applied to a subset of the data and could accurately classify the gases detected, even when confounding factors were introduced.

Crossref

UCL Discovery

Enlighten

Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning

Authors: Hailes Stephen; Musolesi Mirco; Tennant Elizaveta
Publication date: 01/01/2023

Practical uses of Artificial Intelligence (AI) in the real world have demonstrated the importance of embedding moral choices into intelligent agents. They have also highlighted that defining top-down ethical constraints on AI according to any one type of morality is extremely challenging and can pose risks. A bottom-up learning approach may be more appropriate for studying and developing ethical behavior in AI agents. In particular, we believe that an interesting and insightful starting point is the analysis of emergent behavior of Reinforcement Learning (RL) agents that act according to a predefined set of moral rewards in social dilemmas. In this work, we present a systematic analysis of the choices made by intrinsically-motivated RL agents whose rewards are based on moral theories. We aim to design reward structures that are simplified yet representative of a set of key ethical systems. Therefore, we first define moral reward functions that distinguish between consequence- and norm-based agents, between morality based on societal norms or internal virtues, and between single- and mixed-virtue (e.g., multi-objective) methodologies. Then, we evaluate our approach by modeling repeated dyadic interactions between learning moral agents in three iterated social dilemma games (Prisoner's Dilemma, Volunteer's Dilemma and Stag Hunt). We analyze the impact of different types of morality on the emergence of cooperation, defection or exploitation, and the corresponding social outcomes. Finally, we discuss the implications of these findings for the development of moral agents in artificial and mixed human-AI societies. Comment: 7 pages, currently under review for a conference
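The distinction between selfish, consequence-based and norm-based rewards mentioned in this abstract can be made concrete with a small sketch for the Prisoner's Dilemma. Everything here is an assumption for illustration: the payoff values, the function names and the penalty size are hypothetical and are not the reward definitions used in the paper.

```python
# Hypothetical Prisoner's Dilemma payoffs as (my payoff, other's payoff),
# with temptation 5 > reward 3 > punishment 1 > sucker 0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def selfish_reward(my_action: str, other_action: str) -> int:
    """Baseline: the agent's own game payoff."""
    return PAYOFFS[(my_action, other_action)][0]


def utilitarian_reward(my_action: str, other_action: str) -> int:
    """Consequence-based sketch: the collective payoff of both players."""
    mine, theirs = PAYOFFS[(my_action, other_action)]
    return mine + theirs


def norm_reward(my_action: str, other_prev_action: str, penalty: int = -5) -> int:
    """Norm-based sketch: penalise defecting against a player who
    cooperated in the previous round, regardless of the outcome."""
    if my_action == "D" and other_prev_action == "C":
        return penalty
    return 0
```

Under the selfish reward, defecting against a cooperator pays 5 rather than 3, whereas under the utilitarian sketch mutual cooperation (6) beats exploitation (5), so the two reward types pull a learner in different directions.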

arXiv.org e-Print Archive

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Planning spatial networks with Monte Carlo tree search

Authors: Darvariu Victor-Alexandru; Hailes Stephen; Musolesi Mirco
Publication venue: The Royal Society
Publication date: 16/02/2022

We tackle the problem of goal-directed graph construction: given a starting graph, finding a set of edges whose addition maximally improves a global objective function. This problem emerges in many transportation and infrastructure networks that are of critical importance to society. We identify two significant shortcomings of present reinforcement learning methods: their exclusive focus on topology to the detriment of spatial characteristics (which are known to influence the growth and density of links), as well as the rapid growth in the action spaces and costs of model training. Our formulation as a deterministic Markov decision process allows us to adopt the Monte Carlo tree search framework, an artificial intelligence decision-time planning method. We propose improvements over the standard upper confidence bounds for trees (UCT) algorithm for this family of problems that address their single-agent nature, the trade-off between the cost of edges and their contribution to the objective, and an action space linear in the number of nodes. Our approach yields substantial improvements over UCT for increasing the efficiency and attack resilience of synthetic networks and real-world Internet backbone and metro systems, while using a wall clock time budget similar to other search-based algorithms. We also demonstrate that our approach scales to significantly larger networks than previous reinforcement learning methods, since it does not require training a model.
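For readers unfamiliar with the standard UCT baseline that this abstract refers to, the sketch below shows the usual UCB1-style child selection rule. It illustrates plain UCT only, not the improvements proposed in the paper; the function names, the dictionary layout and the default exploration constant are assumptions.

```python
import math
from typing import Dict, Tuple


def uct_score(value_sum: float, visits: int, parent_visits: int,
              c: float = math.sqrt(2)) -> float:
    """Standard UCT score: mean value plus an exploration bonus."""
    if visits == 0:
        # Unvisited children are tried first.
        return float("inf")
    exploit = value_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: Dict[str, Tuple[float, int]],
                 parent_visits: int, c: float = math.sqrt(2)) -> str:
    """Pick the child (for example, a candidate edge) with the highest score.

    `children` maps a hypothetical action label to (value_sum, visits).
    """
    return max(
        children,
        key=lambda a: uct_score(children[a][0], children[a][1], parent_visits, c),
    )
```

The constant `c` trades exploitation against exploration: with `c = 0` the rule is purely greedy, while larger values favour rarely visited actions.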

arXiv.org e-Print Archive

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Significance Test for Inferring Affiliation Networks from Spatio-Temporal Data.

Authors: Furmston Thomas; Hailes Stephen; Morton A Jennifer
Publication venue: PLoS One
Publication date: 01/01/2015

Scientists have long been interested in studying social structures within groups of gregarious animals. However, obtaining evidence about interactions between members of a group is difficult. Recent technologies, such as Global Positioning System technology, have made it possible to obtain a vast wealth of animal movement data, but inferring the underlying (latent) social structure of the group from such data remains an important open problem. While intuitively appealing measures of social interaction exist in the literature, they typically lack formal statistical grounding. In this article, we provide a statistical approach to the problem of inferring the social structure of a group from the movement patterns of its members. By constructing an appropriate null model, we are able to construct a significance test to detect meaningful affiliations between members of the group. We demonstrate our method on large-scale real-world data sets of positional data of flocks of Merino sheep, Ovis aries.

Crossref

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

Cooperation and Reputation Dynamics with Reinforcement Learning

Authors: Anastassacos Nicolas; García Julian; Hailes Stephen; Musolesi Mirco
Publication date: 01/01/2021

Creating incentives for cooperation is a challenge in natural and artificial systems. One potential answer is reputation, whereby agents trade the immediate cost of cooperation for the future benefits of having a good reputation. Game theoretical models have shown that specific social norms can make cooperation stable, but how agents can independently learn to establish effective reputation mechanisms on their own is less understood. We use a simple model of reinforcement learning to show that reputation mechanisms generate two coordination problems: agents need to learn how to coordinate on the meaning of existing reputations and collectively agree on a social norm to assign reputations to others based on their behavior. These coordination problems exhibit multiple equilibria, some of which effectively establish cooperation. When we train agents with a standard Q-learning algorithm in an environment with the presence of reputation mechanisms, convergence to undesirable equilibria is widespread. We propose two mechanisms to alleviate this: (i) seeding a proportion of the system with fixed agents that steer others towards good equilibria; and (ii) intrinsic rewards based on the idea of introspection, i.e., augmenting agents' rewards by an amount proportionate to the performance of their own strategy against themselves. A combination of these simple mechanisms is successful in stabilizing cooperation, even in a fully decentralized version of the problem where agents learn to use and assign reputations simultaneously. We show how our results relate to the literature in Evolutionary Game Theory, and discuss implications for artificial, human and hybrid systems, where reputations can be used as a way to establish trust and cooperation. Comment: Published in AAMAS'21, 9 pages
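The "standard Q-learning algorithm" mentioned in this abstract follows the usual tabular update rule, sketched below. The state and action labels (a partner's reputation as state, cooperate or defect as action), the learning rate and the discount factor are assumptions for illustration and do not come from the paper.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: the state is the partner's reputation label.
q_table: QTable = defaultdict(float)
first = q_update(q_table, "good", "C", 1.0, "good", ["C", "D"])
second = q_update(q_table, "good", "C", 1.0, "good", ["C", "D"])
```

Because each agent updates its own table independently, nothing in this rule forces agents to agree on what a reputation means, which is consistent with the coordination problems the abstract describes.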

arXiv.org e-Print Archive

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna